Scene Understanding through Autonomous Interactive Perception
Authors
Abstract
We propose a framework for detecting, extracting and modeling objects in natural scenes from multi-modal data. The framework is iterative and exploits different hypotheses in a complementary manner: each generated hypothesis feeds into the subsequent one, continuously refining the predictions about the scene. We apply the framework in realistic scenarios, using visual appearance and depth information. A robotic manipulator interacts with the scene, and object hypotheses generated from appearance information are confirmed through pushing. We show results that demonstrate the synergistic effect of applying multiple hypotheses for real-world scene understanding. The method is efficient and runs in real time.
Similar resources
IQA: Visual Question Answering in Interactive Environments
We introduce Interactive Question Answering (IQA), the task of answering questions that require an autonomous agent to interact with a dynamic visual environment. IQA presents the agent with a scene and a question, like: “Are there any apples in the fridge?” The agent must navigate around the scene, acquire visual understanding of scene elements, interact with objects (e.g. open refrigerators) ...
Are there interactive processes in speech perception?
Lexical information facilitates speech perception, especially when sounds are ambiguous or degraded. The interactive approach to understanding this effect posits that this facilitation is accomplished through bi-directional flow of information, allowing lexical knowledge to influence pre-lexical processes. Alternative autonomous theories posit feed-forward processing with lexical influence rest...
The ApolloScape Dataset for Autonomous Driving
Scene parsing aims to assign a class (semantic) label for each pixel in an image. It is a comprehensive analysis of an image. Given the rise of autonomous driving, pixel-accurate environmental perception is expected to be a key enabling technical piece. However, providing a large scale dataset for the design and evaluation of scene parsing algorithms, in particular for outdoor scenes, has been ...
Towards Richer and Self-Supervised Perception in Robots
Robots need to be able to navigate through environments, perform manipulation tasks, and avoid obstacles, all while having a strong spatial and semantic understanding of their immediate environment. This thesis focuses on endowing robots with richer perceptual models for improved navigation and scene understanding. I focus on three fundamental elements that are imperative to robust perception in ...
Perception and Reasoning for Scene Understanding in Human-Robot Interaction Scenarios
In this paper, a combination of perception modules and reasoning engines is used for scene understanding in typical Human-Robot Interaction (HRI) scenarios. The major contribution of this work lies in a 3D object detection, recognition and pose estimation module, which can be trained using CAD models and works for noisy data, partial views and in cluttered scenes. This perception module is combi...